Human Genomics — Latest Matching Preprints

1

New RDEB intermediate variant with in-frame partial exon skipping in FN-III-like domain of type VII collagen.

Evtushenko, N.; Kubanov, A.; Martynova, A.; Kondratyev, N.; Beilin, A.; Karamova, A.; Monchakovskaya, E.; Azimov, K.; Nefedova, M.; Bozhanova, N.; Zaklyazminskaya, E.; Gurskaya, N.

2022-09-04 dermatology 10.1101/2022.09.02.22278356 medRxiv

Top 0.1%

17.5%

Recessive Dystrophic Epidermolysis Bullosa (RDEB) is a debilitating genodermatosis caused by pathogenic mutations in the COL7A1 gene, which induce absence or reduction in the number of anchoring fibrils. The severity of RDEB depends on the mutation type and localization, but many aspects of this dependence remain to be elucidated. Here, we report a novel variant of intermediate RDEB in two unrelated patients. Their disease manifestation includes early skin and oral mucosa blistering and is associated with localized atrophic scarring. According to the exome and Sanger sequencing results, both investigated probands are carriers of complex heterozygosity in the COL7A1 gene, with the same deletion in intron 19 of the gene. RT-PCR followed by sequence analysis revealed skipping of part of exon 19, as well as rescue of the open reading frame (ORF) of COL7A1, in both probands. We hypothesize that the mutation in the acceptor splice site leads to the activation of the cryptic donor splice site, resulting in a truncated but partially functional protein and the milder phenotype of intermediate RDEB. This rare type of mutation expands our understanding of RDEB etiology and invites further investigation.

2

A rare non-coding enhancer variant in SCN5A contributes to the high prevalence of Brugada syndrome in Thailand

Walsh, R.; Mauleekoonphairoj, J.; Mengarelli, I.; Verkerk, A. O.; Bosada, F. M.; van Duijvenboden, K.; Poovorawan, Y.; Wongcharoen, W.; Sutjaporn, B.; Wandee, P.; Chimparlee, N.; Chokesuwattanaskul, R.; Vongpaisarnsin, K.; Dangkao, P.; Wu, C.-I.; Tadros, R.; Amin, A. S.; Lieve, K. V. V.; Postema, P. G.; Kooyman, M.; Beekman, L.; Phusanti, K.; Sahasatas, D.; Amnueypol, M.; Krittayaphong, R.; Prechawat, S.; Anannab, A.; Makarawate, P.; Ngarmukos, T.; Veerakul, G.; Kingsbury, Z.; Newington, T.; Maheswari, U.; Ross, M. T.; Grace, A.; Lambiase, P. D.; Behr, E. R.; Schott, J.-J.; Redon, R.; Barc, J.

2023-12-20 genetic and genomic medicine 10.1101/2023.12.19.23299785 medRxiv

Top 0.1%

10.5%

Brugada syndrome (BrS) is a cardiac arrhythmia disorder that causes sudden death in young adults. Rare genetic variants in the SCN5A gene, encoding the Nav1.5 sodium channel, and common non-coding variants at this locus, are robustly associated with the condition. BrS is particularly prevalent in Southeast Asia but the underlying ancestry-specific factors remain largely unknown. Here, we performed genome sequencing of BrS probands from Thailand and population-matched controls and identified a rare non-coding variant in an SCN5A intronic enhancer that is highly enriched in BrS cases (3.9% in cases, odds ratio 20.2-45.2) and predicted to disrupt a Mef2 transcription factor binding site. Heterozygous introduction of the enhancer variant in human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) caused significantly reduced SCN5A expression from the variant-containing allele and a 30% reduction in Nav1.5-mediated sodium-current density compared to isogenic controls. This is the first example of a validated rare non-coding variant at the SCN5A locus and partly explains the increased prevalence of BrS in this geographic region.

3

The UAE Genome Program: Unique Genetic Insights from 43,608 Individuals

Mousa, M. M.; Olbrich, M. M.; Wohlers, I. I.; Al aamri, A. A.; Alsuwaidi, A. H.; Marzouka, N. a.-d.; Alnaqbi, H. H.; Alameri, M. S.; Ruta, D. D.; Alazazi, J. J.; Magalhaes, T. T.; Mafofo, J. J.; Quilez, J. J.; Allam, M. M.; Mohamad, M. S.; Drou, N. N.; Idaghdour, Y. Y.; Hamoudi, R. R.; Tay, G. G.; Ibrahim, S. S.; Alkaabi, F. F.; Al Mannaei, A. A.; Alsafar, H. H.

2025-09-14 genetic and genomic medicine 10.1101/2025.09.12.25334546 medRxiv

Top 0.1%

10.2%

Here, we present a comprehensive genomic characterization of a cohort of 43,608 Emirati genomes sequenced as part of the Emirati Genome Program (EGP). This study identified more than 421 million single-nucleotide variants and indels and more than 600 million copy-number and structural variants. Small variants had 756 million molecular effects annotated. Of 7.7 million polymorphic variants having an allele frequency (AF) of more than 5% in EGP, 1,348 have a predicted deleterious effect on a protein. Characterization with respect to global variation shows that EGP represents a genetic continuum encompassing the range of African, Asian, and European populations. It is best described by two Arabian, a Eurasian, and an African component, with the predominant Arabian component linked to mitochondrial haplogroups J and T that are commonly attributed to the Middle East. Various aneuploidies of sex chromosomes were detected in 93 individuals overall, and aneuploidy of chromosome 21 was identified in 41 individuals. Median inbreeding coefficient and cumulative lengths of runs of homozygosity (ROHs) were increased due to extensive consanguinity, were largest in the groups with Arabian main ancestry components, and were higher than reported for Qatar. Families were identified based on genetic relatedness and classified into 264 families with unrelated parents and 247 families with third- and fourth-degree consanguineous parents. Representative consanguineous pedigrees of families in EGP were outlined. Cumulative ROHs were affected by the main ancestry component and significantly increased in offspring of consanguineous parents, with a pronounced difference between third- and fourth-degree relatedness. Investigation of cumulative AFs of variants causing Mendelian diseases highlighted genes related to alpha- and beta-thalassemia (HBB, HBA2) and showed a high burden of variants causing severe recessive diseases, metabolic and retinal disorders, and hearing loss.
In summary, EGP represents a landmark effort in characterizing the genetic diversity of the Emirati population, leveraging the largest Middle Eastern cohort reported to date.

4

The Iberian Roma genetic variant server; population structure, susceptibility to disease and adaptive traits.

Mavillard, F.; Perez-Florido, J.; Ortuno, F.; Valladares, A.; Alvarez-Villegas, M. L.; Roldan, G.; Carmona, R.; Soriano, M.; Susarte, S.; Fuentes, P.; Lopez-Lopez, D.; Nunez-Negrillo, A. M.; Carvajal, A.; Morgado, Y.; Arteaga, D.; Ufano, R.; Mir, P.; Gamella, J.; Dopazo, J.; Paradas, C.; Cabrera Serrano, M.

2023-08-25 genetic and genomic medicine 10.1101/2023.08.25.23294490 medRxiv

Top 0.1%

10.0%

The Roma are the most numerous ethnic minority in Europe. The Iberian Roma arrived in the Iberian Peninsula five centuries ago and still maintain a strong group identity today. Demographic and cultural reasons lie behind a high rate of Mendelian disease, often related to founder variants. We have analysed exome data from 119 Iberian Roma individuals collected from 2018 to 2020. A database of variant frequencies (IRPVS) has been implemented and made available online. We have analysed the carrier rate of private founder alleles as well as of pathogenic variants present in the general population. Significant enrichment in structural variants involving gene clusters related to keratinization and epidermal growth suggests that evolutionary mechanisms have developed towards climate and environmental adaptation. IRPVS can be accessed at http://irpvs.clinbioinfosspa.es/ Author Summary: Reference data are necessary for the correct interpretation of genetic studies. Although most genetic variants are present in all populations, ancestry has an important impact on the genetic background. For that reason, databases of genetic variants in populations are developed specifically for different ethnicities and are an important tool for genetic diagnosis. The Roma are the most numerous ethnic minority in Europe. In this study we have collected samples from healthy Roma individuals of Iberian descent and implemented a database of genetic variants to facilitate genetic diagnosis in this population. Analysis of structural variants that are specific to the Iberian Roma, and not found in other healthy populations for which genetic data are available, suggests evolution towards environmental adaptation.

5

Body Composition, Cardiometabolic Risk Factors and Comorbidities in Psoriasis and the Effect of HLA-C*06:02 Status: The HUNT Study, Norway

Solvin, A. O.; Bjarko, V. V.; Thomas, L. F.; Berrospi, P.; Hveem, K.; Saunes, M.; Asvold, B. O.; Loset, M.

2022-10-08 dermatology 10.1101/2022.10.07.22280812 medRxiv

Top 0.1%

8.4%

Psoriasis has been associated with increased adiposity measures driving systemic inflammation, which may lead to metabolic dysfunction and comorbidities. In this population-based, cross-sectional study, we used data from 56 042 individuals in the fourth wave of the Trondelag Health Study (HUNT4), to investigate the associations between psoriasis and body composition measures assessed using bioelectrical impedance analysis, cardiometabolic risk factors, and comorbidities. Further, we investigated the associations between HLA-C*06:02 status, a potential clinical biomarker for a distinct psoriasis endotype, and these outcomes. Psoriasis was associated with increased adiposity measures, including increased body and visceral fat, and lower levels of skeletal muscle and soft lean mass, as well as higher prevalence of cardiovascular, respiratory and endocrine disorders. HLA-C*06:02-positive individuals with psoriasis had lower levels of hsCRP, increased prevalence of atrial fibrillation and decreased prevalence of migraine. Our results point to altered body composition in psoriasis with increased levels of fat, and particularly metabolically active visceral fat, and provide support for a broad clinical approach to psoriatic patients in a general population.

6

Methicillin-Susceptible Staphylococcus aureus ST398 in atopic dermatitis in Portugal displays pathogenic traits associated with impaired skin barrier function

Caieiro, D.; Faria, N. A.; Botelho, A.; Araujo, M.; Ramos, L.; Calvao, J.; Goncalo, M.; Miragaia, M.

2026-02-18 dermatology 10.64898/2026.02.17.26346495 medRxiv

Top 0.1%

8.4%

Staphylococcus aureus plays a central role in the exacerbation of atopic dermatitis (AD), but the population structure and pathogenic determinants of strains colonizing AD patients remain poorly understood. It is unclear whether these strains mirror those circulating in the general community or whether specific clonal lineages are selectively adapted to the AD skin microenvironment. Data addressing this question are scarce, particularly in Portugal. In this study, we investigated the molecular epidemiology and pathogenic traits of S. aureus colonizing skin lesions in adult patients with AD in Portugal. We found that lesion-associated isolates belonged predominantly to the methicillin-susceptible S. aureus MSSA-ST398 clonal type, a lineage that is widely circulating in the Portuguese community, particularly among vulnerable populations, and that has also been implicated in severe human infections. Notably, isolates from this clonal type in AD harboured specific pathogenicity traits associated with skin barrier disruption, including hemolysin and urease production, which may contribute to their success as colonizers in AD. Our findings highlight that S. aureus colonization in AD arises from a dynamic interplay between community-level molecular epidemiology and disease-specific selective pressures. While circulating lineages provide the genetic background diversity, the AD skin microenvironment appears to shape which clones ultimately become dominant. Such an integrated perspective may help to inform future geographically tailored strategies aimed at limiting bacterial burden and preventing disease exacerbation in AD.

7

Identification of Long Non-coding RNA Candidate Disease Genes Associated with Clinically Reported CNVs in Congenital Heart Disease

Penaloza, J. S.; CCVM Consortium; Moreland, B.; Gaither, J. B.; Landis, B. J.; Ware, S. M.; McBride, K. L.; White, P.

2024-10-02 genomics 10.1101/2024.09.30.615967 medRxiv

Top 0.1%

7.3%

Background: Copy Number Variants (CNVs) contribute to 3-10% of isolated Congenital Heart Disease (CHD) cases, but their roles in disease pathogenesis are often unclear. Traditionally, diagnostics have focused on protein-coding genes, overlooking the pathogenic potential of non-coding regions constituting 99% of the genome. Long non-coding RNAs (lncRNAs) are increasingly recognized for their roles in development and disease. Methods: In this study, we systematically analyzed candidate lncRNAs overlapping with clinically validated CNVs in 1,363 CHD patients from the Cytogenomics of Cardiovascular Malformations (CCVM) Consortium. We identified heart-expressed lncRNAs, constructed a gene regulatory network using Weighted Gene Co-expression Network Analysis (WGCNA), and identified gene modules significantly associated with heart development. Functional enrichment analyses and network visualizations were conducted to elucidate the roles of these lncRNAs in cardiac development and disease. The code is stably archived at https://doi.org/10.5281/zenodo.13799847. Results: We identified 18 lncRNA candidate genes within modules significantly correlated with heart tissue, highlighting their potential involvement in CHD pathogenesis. Notably, lncRNAs such as lnc-STK32C-3, lnc-TBX20-1, and CRMA demonstrated strong associations with known CHD genes. Strikingly, while only 7.6% of known CHD genes were impacted by a CNV, 68.8% of the CNVs contained a lncRNA expressed in the heart. Conclusions: Our findings highlight the critical yet underexplored role of lncRNAs in the genomics of CHD. By investigating CNV-associated lncRNAs, this study paves the way for deeper insights into the genetic basis of CHD by incorporating non-coding genomic regions. The research underscores the need for advanced annotation techniques and broader genetic database inclusion to fully capture the potential of lncRNAs in disease mechanisms.
Overall, this work emphasizes the importance of the non-coding genome as a pivotal factor in CHD pathogenesis, potentially uncovering novel contributors to disease risk.

8

Single-cell Landscape of Immune Cells in Blood and Skin in Psoriasis

Deng, J.; Nordkamp, M. O.; Ye, S.; Ye, J.; Balak, D.; Yu, W.; Radstake, T.; Borghans, J. A. M.; Lu, C.; Pandit, A.; Gerritsen, B.

2024-09-21 systems biology 10.1101/2024.09.17.613463 medRxiv

Top 0.1%

6.9%

Background: Psoriasis is a systemic inflammatory disease for which there is currently no cure, in part due to an incomplete understanding of its pathophysiology. Methods: To better understand the immune response in psoriasis, we performed single-cell RNA sequencing (scRNA-seq) on peripheral blood mononuclear cells (PBMCs) and on lesional and non-lesional skin samples from a cohort of 11 psoriasis patients and 8 healthy controls. Additionally, we conducted flow cytometry on PBMCs from a separate cohort of 13 psoriasis patients and 11 ankylosing spondylitis patients. Findings: Our study revealed altered immune signatures of specific myeloid and lymphocyte subsets in blood and skin, both in terms of cell numbers and gene expression. Specifically, we discovered elevated proportions of circulating CD14++ monocytes, increased expression of major histocompatibility complex (MHC) class II molecules by circulating CD16+ monocytes, as well as increased expression of genes related to skin homing and to pro-inflammatory responses in psoriasis by circulating plasmacytoid dendritic cells (pDCs). Circulating CD8+ T effector memory cells in psoriasis patients exhibited reduced abundance but increased skin-homing potential. In psoriatic lesions, we observed a hyperinflammatory myeloid-cell state and enrichment of IL17-producing cells with a tissue-resident memory T-cell signature. Interpretation: The changes in immune cell numbers and gene expression indicate a significant alteration in the immune landscape of psoriasis patients. This suggests that the immune system in psoriasis is reprogrammed, affecting both innate and adaptive branches. These findings provide new insights into the aberrant immune-cell signatures in the circulation and skin lesions in psoriasis, and thereby help to understand its pathophysiology.
Funding: This study was financially supported by the National Natural Science Foundation of China (U23A6012), Science and Technology Planning Project of Guangzhou (2024A03J0055, 202206080005), Innovation Team and Talents Cultivation Program of National Administration of Traditional Chinese Medicine (ZYYCXTD-C-202204).

9

The genetic basis of dermatophytosis skin infection susceptibility

Haapaniemi, H.; Eghtedarian, R.; Tervi, A.; Estonian Biobank Research Team; FinnGen; Abner, E.; Ollila, H. M.

2024-11-26 dermatology 10.1101/2024.11.25.24317872 medRxiv

Top 0.1%

6.9%

Dermatophytosis is an infection caused by fungi that utilize keratinized tissues, such as skin, nails, and hair, as their energy source. This infection commonly presents as red, itchy, ring-like patches on the skin, nail thickening, or hair loss. With ever-increasing case numbers, it has become a significant public health concern estimated to affect 20% of the world's population. Despite the high prevalence, the genetic risk factors for dermatophytosis are poorly understood. Our goal was to elucidate the biological mechanisms underlying individual susceptibility to dermatophytosis and to explore its genetic associations with other diseases and traits. We performed a large-scale genome-wide association meta-analysis of dermatophytosis infections with over 250,000 cases and 1,370,000 controls using data from FinnGen, Estonian Biobank, UK Biobank and Million Veterans Program. We identified 30 genome-wide significant loci including seven missense variants and two variants in high linkage disequilibrium with missense variants. The strongest associations were with variants within or closest to the ZNF646 (p = 6.60 × 10^-79, beta = 0.07), HLA-DQB1 (p = 1.42 × 10^-36, beta = 0.05), FLG (p = 1.96 × 10^-27, beta = -0.22), FTO (p = 5.75 × 10^-26, beta = -0.04), SLURP2 (p = 3.33 × 10^-24, beta = 0.04) and KRT77 (p = 1.28 × 10^-15, beta = 0.03) genes. Overall, our findings implicate keratin lifecycle and skin integrity, immune defense, and obesity as risk factors for dermatophytosis. Our findings highlight the clinical comorbidities with other skin diseases and with high BMI, and identify novel genetic variants, some of which are candidates for managing dermatophytosis infection.

10

Clinical population genetic analysis of variants in the SARS-CoV-2 receptor ACE2

Ardeshirdavani, A.; Zakeri, P.; Mehrtash, A.; Hosseini, S. M.; Li, G.; Mirtavoos-Mahyari, H.; Soltanpour, M. J.; Tavallaie, M.; Moreau, Y.

2020-05-29 genetic and genomic medicine 10.1101/2020.05.27.20115071 medRxiv

Top 0.1%

6.9%

Purpose: SARS-CoV-2 infects cells via the human Angiotensin-converting enzyme 2 (ACE2) protein. The genetic variation of ACE2 function and expression across populations is still poorly understood. This study aims at better understanding the genetic basis of COVID-19 outcomes by studying association between genetic variation in ACE2 and disease severity in the Iranian population. Methods: We analyzed two large Iranian cohorts and several publicly available human population variant databases to identify novel and previously known ACE2 exonic variants present in the Iranian population and considered those as candidate variants for association between genetic variation and disease severity. We genotyped these variants across three groups of COVID-19 patients with different clinical outcomes (mild disease, severe disease, and death) and evaluated this genetic variation with regard to clinical outcomes. Results: We identified 32 exonic variants present in Iranian cohorts or other public variant databases. Among those, 11 variants are novel and have thus not been described in other populations previously. Following genotyping of these 32 candidate variants, only the synonymous polymorphism (c.2247G>A) was detected across the three groups of COVID-19 patients. Conclusion: Genetic variability of known and novel exonic variants was low among our COVID-19 patients. Our results do not provide support for the hypothesis that exonic variation in ACE2 has a sizeable impact on COVID-19 severity across the Iranian population.

11

Meta-analysis of RNA sequencing data from 534 skin samples shows substantial IL-17 effects in non-lesional psoriatic skin

Solvin, A. O.; Chawla, K.; Jenssen, M.; Olsen, L. C.; Furberg, A.-S.; Danielsen, K.; Saunes, M.; Hveem, K.; Saetrom, P.; Loset, M.

2023-11-04 dermatology 10.1101/2023.11.03.23298021 medRxiv

Top 0.1%

6.7%

Psoriasis is a common chronic inflammatory skin disease characterized by disturbed interactions between infiltrating immune cells and keratinocytes. To enhance our understanding of the underlying molecular and cellular mechanisms driving psoriasis pathobiology, and to identify potential biomarkers for disease severity, we conducted RNA sequencing of skin biopsies from 75 patients with psoriasis vulgaris and 46 non-psoriatic controls. To increase the robustness of the results, we meta-analysed our data with four publicly available datasets, bringing the total number of samples to 534. By comparing lesional psoriatic (PP) to healthy control (NN) skin, we identified 2269 differentially expressed genes (DEGs) (|log2FC| > 1.0, FDR < 0.1), and 58 DEGs when comparing non-lesional psoriatic (PN) to NN skin. We also identified 54 DEGs associated with disease severity (PASI ≥10 vs. PASI <10). Cellular deconvolution analysis showed that differentiated keratinocytes emerged as the most prominent cell type among the DEGs in PP/NN. Functional enrichment analysis in PN/NN revealed several IL-17 related pathways and confirmed a previously reported pre-inflammatory signature across all psoriatic skin. This study provides insights into the psoriasis transcriptome and identifies a severity-specific signature, which may serve as a candidate for future studies aimed at identifying psoriasis biomarkers and predicting disease progression.

12

Expanding CIRdb, a comprehensive catalog of whole-exome sequencing data of Canary Islanders

Diaz-de Usera, A.; Rubio-Rodriguez, L. A.; Munoz-Barrera, A.; Lorenzo-Salazar, J. M.; Guillen-Guio, B.; Jaspez, D.; Corrales, A.; Marcelino-Rodriguez, I.; Rodriguez-Perez, M. d. C.; Cabrera-de Leon, A.; Gonzalez-Montelongo, R.; Cruz-Guerrero, R.; Carracedo, A.; Flores, C.

2025-11-27 genetic and genomic medicine 10.1101/2025.11.24.25340885 medRxiv

Top 0.1%

6.6%

Within the intricate European genetic diversity landscape, Canary Islanders exhibit a unique genetic admixture, comprising European (EUR), North African (NAF), and sub-Saharan African (SSA) ancestries. This study aimed to comprehensively characterize the full spectrum of small genetic variation among 920 unrelated donors from this population based on whole-exome sequencing data to further develop CIRdb as the Canary Islanders-specific reference catalog of genetic variation. We combined this with SNP array data and whole-genome sequencing for specific analyses, revealing a total of 387,555 variants, of which 15.1% were previously unreported. Notably, 74.4% of these variants were classified as rare (with frequency <0.5%), including up to 40% of singletons. We also identified and curated a set of 2,068 variants prioritized as putative pathogenic. Intriguingly, the novel pathogenic variants exhibited enrichment in respiratory, cardiovascular, and metabolic disorders. Genetic differentiation patterns clustered individuals from the smallest islands separately, providing fine-grained insights into within-archipelago differentiation. A scan of local genetic ancestry deviations across the genome revealed an EUR ancestry enrichment around the 17q21.31 inversion, widely recognized for positive selection and associated with pleiotropic effects across pulmonary, infectious, and immunological diseases. Our results also evidenced a selective sweep shared by Canary Islanders and the NAF population around the Prune Exopolyphosphatase 1 gene, which is associated with body mass index, cardiovascular health, and metabolic traits. Taken together, CIRdb presents a valuable resource of exome-wide genetic variation in a population at the edge of Southwestern European genetic diversity.

13

Genetic analyses identify shared genetic components related to autoimmune and cardiovascular diseases

Qiao, J.; Chang, M.; Chen, M.; Zhao, Y.; Hao, J.; Zhang, P.; Zhou, R.; Cai, L.; Liu, F.; Fan, X.; Pauklin, S.; Zou, R.; Li, Z.; Feng, Y.

2024-09-01 genetic and genomic medicine 10.1101/2024.09.01.24310190 medRxiv

Top 0.1%

6.4%

Objectives: Autoimmune diseases (ADs) play a significant and intricate role in the onset of cardiovascular diseases (CVDs). Our study aimed to elucidate the shared genetic etiology between ADs and CVDs. Methods: We conducted genome-wide pleiotropy analyses to comprehensively investigate the genetic foundation and shared etiology of six ADs and six CVDs. We analyzed the genetic architecture and genetic overlap between these traits. Then, SNP-level functional annotation identified significant genomic risk loci and potential causal variants. Gene-level analyses explored shared pleiotropic genes, followed by pathway enrichment analyses to elucidate underlying biological mechanisms. Finally, we assessed potential causal pathways between ADs and CVDs. Results: Despite negligible overall genetic connections, our results revealed a significant genetic overlap between ADs and CVDs, indicating a complex shared genetic architecture spread throughout the genome. The shared loci implicated several genes, including ATXN2, BRAP, SH2B3, ALDH2 (all located at 12q24.11-12), RNF123, MST1R, RBM6, and UBA7 (all located at 3p21.31), all of which are protein-coding genes. Top biological pathways enriched with these shared genes were related to the immune system and intracellular signal transduction. Conclusions: The extensive genetic overlap with mixed effect directions between ADs and CVDs indicates a complex genetic relationship between these diseases. It suggests overlapping genetic risk may contribute to shared pathophysiological and clinical characteristics and may guide clinical treatment and management.

14

A Foundational Exome Resource for Jordan: Dual Ancestry Admixture and Population-Specific Variants to Improve Clinical Variant Interpretation

Froukh, T.

2026-05-27 genetic and genomic medicine 10.64898/2026.05.23.26353895 medRxiv

Top 0.1%

6.4%

Currently, the genetic architecture of Middle Eastern populations is underrepresented in global genomic databases. This gap increases the rate of Variants of Uncertain Significance (VUSs) and clinical misinterpretations of genomic data, especially in Middle Eastern populations. Whole exome sequencing was conducted on 90 healthy individuals from Jordan and the data were analysed using Principal Component Analysis (PCA) and multi-computational filtering. PCA revealed a double ancestry (EUR-AFR) admixture rather than a triple admixture (EUR-AFR-AMR). More than 3,500 population-specific variants (PSVs) were identified, of which 72% were singletons. Additionally, 19 variants were significantly enriched compared to the maximum allele frequencies in public global databases (Fisher's exact test with Benjamini-Hochberg false discovery rate correction, p-value < 0.05). Consequently, the results suggest the reclassification of VUSs that reside in the ECE2 gene to likely benign, and of the variants of Conflicting Classification of Pathogenicity in the genes IL1RN and THPO to benign, based on the significant allele frequency (AF=0.0389, p-value < 0.05). Furthermore, a pathogenic ClinVar variant was identified in a healthy individual, warranting careful interpretation. The findings underscore the importance of identifying PSVs in order to minimize or even prevent clinical misdiagnosis and highlight the unique genetic signature in Jordan. The study serves as a foundational resource for precision medicine in the region.

15

Comparison of Two Genome-Wide Association Studies for Heart Rate Response to Exercise from the UK Biobank

Thakral, A.; Paterson, A. D.

2021-07-09 genetic and genomic medicine 10.1101/2021.07.07.21259806 medRxiv

Top 0.1%

6.4%

The short-term changes in heart rate (HR) during and after exercise are important physiologic traits mediated via the autonomic nervous system. Variations in these traits are associated with mortality from cardiovascular causes. We conducted a systematic review of genome-wide association studies for these traits (with >10,000 participants) with the aim of comparing Polygenic Risk Scores (PRS) from different studies. Additionally, we applied the STrengthening of Reporting of Genetic Association Studies (STREGA) statement for assessing the completeness of reporting of evidence. Our systematic search yielded two studies (Verweij et al. and Ramirez et al.) that met our inclusion criteria. Both were conducted on the UK Biobank. Both defined their exercise traits as the difference between resting HR and the maximum HR during exercise. Their recovery traits were defined differently. Verweij et al. defined 5 recovery traits as the differences between the peak HR during exercise and the HRs at 10-50 sec post exercise cessation. Ramirez et al. defined their recovery trait as the difference between peak HR during exercise and the minimum HR during the minute post exercise cessation. While Ramirez et al. divided their sample into discovery and replication subsets, Verweij et al. analyzed the whole sample together. In terms of results, there were several common SNPs identified between studies and traits. There was evidence for the phenomenon of winner's curse operating for a SNP from the Ramirez study's HR recovery analysis. Many of the SNPs were mutually exclusive between the studies. However, there was a good agreement of PRS from the studies. The differences in the results could be attributed to the different exclusion criteria, analytic approaches, and definitions of traits used. Both studies had an under-representation of individuals of non-European ancestry compared to those of European ancestry.
Further studies with proportionate representation of individuals of all ancestries would help address this gap.

16

Next generation sequencing reveals NRAP as a candidate gene for hypertrophic cardiomyopathy in elderly patients

Sharma, A.; Koranchery, R.; Rajendran, R.; Mohanan, K. S.; Shenthar, J.; Perundurai, D. S.

2019-10-02 genomics 10.1101/789065 medRxiv

Top 0.1%

6.4%

Hypertrophic cardiomyopathy (HCM) is a genetic disorder that affects people of all ages, with the elderly population being inadequately studied. It is primarily caused by variants in genes that encode proteins involved in the structure and function of the heart muscle. The identification of genes associated with elderly HCM requires ethnic-specific genomic sequences from Wellderly individuals. Currently, no Indian Wellderly dataset is available. To address this, we collected and sequenced the Indian Wellderly population. We built a novel Indian database of healthy aging nucleotide sequences (named i-DHANS), which is newly accessible at IndiCardiome. Utilizing this database and an Indian HCM cohort, we identified nebulin-related anchoring protein (NRAP) as a gene associated with elderly HCM. NRAP is crucial for the assembly of myofibrils and transmission of force from the sarcomere to the extracellular matrix. Our functional analysis showed that the identified NRAP variant had significantly reduced interactions with its interacting partners, such as Kelch-like protein 41 (KLHL41) and α-actinin, implying a loss of function. In summary, our findings indicate that NRAP is a new elderly cardiomyopathy gene, and our Indian Wellderly database is a valuable resource for identifying ethnic-specific genes for various diseases.

17

Genomic and transcriptomic data analyses highlight KPNB1 and MYL4 as novel risk genes for congenital heart disease

Broberg, M.; Ampuja, M.; Jones, S.; Ojala, T.; Rahkonen, O.; Kivela, R.; Priest, J.; FinnGen, ; Ollila, H. M.; Helle, E.

2022-01-08 genetic and genomic medicine 10.1101/2022.01.07.22268881 medRxiv

Top 0.1%

6.3%


Congenital heart defects (CHD) are structural defects of the heart affecting approximately 1% of newborns. CHDs exhibit a complex inheritance pattern. While genetic factors are known to play an important role in the development of CHD, relatively few variants have been discovered so far and very few genome-wide association studies (GWAS) have been conducted. We performed a GWAS of general CHD and five CHD subgroups in FinnGen, followed by functional fine-mapping through eQTL analysis in the GTEx database and target validation in human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CM) from CHD patients. We discovered that the MYL4-KPNB1 locus (rs11570508, beta = 0.24, P = 1.2 × 10^-11) was associated with the general CHD group. An additional four variants were significantly associated with the different CHD subgroups. Two of these, rs1342740627 associated with left ventricular outflow tract obstruction defects and rs1293973611 associated with septal defects, were enriched in the Finnish population. The variant rs11570508 was associated with the expression of MYL4 (normalized expression score (NES) = 0.1, P = 0.0017, in the atrial appendage of the heart) and KPNB1 (NES = -0.037, P = 0.039, in the left ventricle of the heart). Furthermore, lower expression levels of both genes were observed in hiPSC-CM from CHD patients compared to healthy controls. Together, the results implicate KPNB1 and MYL4 as potential genetic risk loci associated with the development of CHD.

18

Prioritizing Cardiovascular Disease-Associated Variants Altering NKX2-5 Binding through an Integrative Computational Approach

Pena-Martinez, E. G.; Pomales-Matos, D. A.; Rivera-Madera, A.; Messon-Bird, J. L.; Medina-Feliciano, J. G.; Sanabria-Alberto, L.; Barreiro-Rosario, A. C.; Rodriguez-Rios, J. M.; Rodriguez-Martinez, J. A.

2023-09-02 genetic and genomic medicine 10.1101/2023.09.01.23294951 medRxiv

Top 0.1%

6.3%


Cardiovascular diseases (CVDs) are the leading cause of death worldwide and are heavily influenced by genetic factors. Genome-wide association studies (GWAS) have mapped > 90% of CVD-associated variants within the non-coding genome, which can alter the function of regulatory proteins, like transcription factors (TFs). However, due to the overwhelming number of GWAS single nucleotide polymorphisms (SNPs) (>500,000), prioritizing variants for in vitro analysis remains challenging. In this work, we implemented a computational approach that considers support vector machine (SVM)-based TF binding site classification and cardiac expression quantitative trait loci (eQTL) analysis to identify and prioritize potential CVD-causing SNPs. We identified 1,535 CVD-associated SNPs that occur within human heart footprints/enhancers and 9,309 variants in linkage disequilibrium (LD) with differential gene expression profiles in cardiac tissue. Using hiPSC-CM ChIP-seq data from NKX2-5 and TBX5, two cardiac TFs essential for proper heart development, we trained a large-scale gapped k-mer SVM (LS-GKM-SVM) predictive model that can identify binding sites altered by CVD-associated SNPs. The computational predictive model was tested by scoring human heart footprints and enhancers in vitro through electrophoretic mobility shift assay (EMSA). Three variants (rs59310144, rs6715570, and rs61872084) were prioritized for in vitro validation based on their eQTL in cardiac tissue and LS-GKM-SVM prediction to alter NKX2-5 DNA binding. All three variants altered NKX2-5 DNA binding. In summary, we present a bioinformatic approach that considers tissue-specific eQTL analysis and SVM-based TF binding site classification to prioritize CVD-associated variants for in vitro experimental analysis. 

19

Genetic differences between extreme and composite constitution types from whole exome sequences reveal actionable variations

Abbas, T.; Kutum, R.; Pandey, R.; Dakle, P.; Narang, A.; Manchanda, V.; Patil, R.; Aggarwal, D.; Bansal, G.; Sharma, P.; Chaturvedi, G.; Girase, B.; Srivastava, A.; Juvekar, S.; Dash, D.; Prasher, B.; Mukerji, M.

2020-04-28 genomics 10.1101/2020.04.24.059006 medRxiv

Top 0.1%

6.3%


Personalized medicine relies on successful identification of genome-wide variations that govern inter-individual differences in phenotypes and system-level outcomes. In Ayurveda, assessment of composite constitution types ("Prakriti") forms the basis for risk stratification, prediction of health and disease trajectories, and personalized recommendations. Here, we report a novel method for identifying pleiotropic genes and variants that associate with healthy individuals of three extreme and contrasting "Prakriti" constitutions through exome sequencing and state-of-the-art computational methods. Exome sequences of three extreme Prakriti types from 108 healthy individuals, 54 each from genetically homogeneous populations of North India (NI, discovery cohort) and Western India (VADU, replication cohort), were evaluated. Fisher's exact test was applied between Prakriti types in both cohorts, and a permutation-based p-value was then used to select exonic variants. To investigate the effect of sample size per genetic association test, we performed a power analysis. The functional impact of differentiating genes and variations was inferred using diverse resources: ToppFun, GTEx, GWAS, PheWAS, UK Biobank and mouse knockdown/knockout phenotypes (MGI). We also applied a supervised machine learning approach to evaluate the association of exonic variants with multisystem phenotypes of Prakriti. Our targeted investigation of exome sequencing datasets from the NI (discovery) and VADU (validation) cohorts yielded ~7,000 differentiating SNPs. Closer inspection further identified a subset of SNPs (2,407 in NI and 2,393 in VADU) that mapped to an overlapping set of 1,181 genes. This set can robustly stratify the Prakriti groups into three distinct clusters with distinct gene ontology (GO) enrichments. Functional analysis further supports the potential pleiotropic effects of these differentiating genes/variants and their multisystem phenotypic consequences.
Replicated SNPs map to prominent genes such as FIG4, EDNRA, ANKLE1, BCKDHA, ATP5SL, EXOCS5, IFIT5, ZNF502, PNPLA3 and IL6R. Lastly, multivariate analysis using random forest uncovered rs7244213 within the urea transporter gene SLC14A2, which associates with an ensemble of features linked to distinct constitutions. Our results reinforce the concept of integrating Prakriti-based deep phenotypes for risk stratification of healthy individuals and provide markers for early actionable interventions.
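
The variant selection step in the abstract above (Fisher's exact test between groups, followed by a permutation-based p-value) can be illustrated with a minimal sketch. The abstract does not specify the contingency table layout or the permutation scheme, so the carrier/non-carrier table, the label-shuffling procedure, and the function names `fisher_exact_two_sided` and `permutation_p_value` are all assumptions made for illustration, not the authors' pipeline.

```python
import random
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as probable as the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def permutation_p_value(carriers_a, carriers_b, n_perm=2000, seed=0):
    """Empirical p-value from shuffling group labels (assumed scheme).

    carriers_a / carriers_b: lists of 0/1 flags, 1 = individual carries
    the variant. Returns (observed Fisher p-value, permutation p-value).
    """
    def table_p(group_a, group_b):
        a, c = sum(group_a), sum(group_b)
        return fisher_exact_two_sided(a, len(group_a) - a, c, len(group_b) - c)

    observed = table_p(carriers_a, carriers_b)
    pooled = list(carriers_a) + list(carriers_b)
    n_a = len(carriers_a)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if table_p(pooled[:n_a], pooled[n_a:]) <= observed * (1 + 1e-9):
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical carrier flags for two constitution groups (not study data).
    group_1 = [1] * 9 + [0] * 9
    group_2 = [1] * 2 + [0] * 16
    print(permutation_p_value(group_1, group_2))
```

With fixed group sizes, shuffling labels samples the same null distribution that Fisher's exact test enumerates, so the two p-values should roughly agree for a single variant; permutation becomes more informative when the statistic is taken across many variants at once, which this sketch does not attempt.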

20

COVID-19 relevant genetic variants confirmed in an admixed population

Texis, T.; Cruz-Jaramilllo, J. L.; Garcia-Munoz, W.; Anzures-Cortes, L.; Hadadd-Talancon, L.; Sanchez-Garcia, S.; Jimenez-Martinez, M. d. C.; Perez-Barragan, E.; Nieto-Patlan, A.; Martinez-Ezquerro, J. D.; Rubio-Carrasco, K.; Rodriguez-Dorantes, M.; Cortes-Ramirez, S.; Mellado-Sanchez, G.; Perez-Tapia, S. M.; Gonzalez-Covarrubias, V.

2022-04-16 genetic and genomic medicine 10.1101/2022.04.15.22273925 medRxiv

Top 0.1%

6.2%


The dissection of factors that contribute to COVID-19 infection and severity has overwhelmed the scientific community for almost 2 years. Current reports highlight the role of genetic factors in disease incidence, progression, and severity. Here, we aimed to confirm the presence of previously reported genetic variants in an admixed population. Allele frequencies were assessed and compared between the general population (N=3079), of which at least 30% had not been infected with SARS-CoV-2 as of July 2021, and COVID-19 patients (N=106). Genotyping data from the Illumina GSA array was used to impute genetic variation for 14 COVID-relevant genes, using the 1000G phase 3 as reference based on the human genome assembly hg19, following current standard protocols and recommendations for genetic imputation. Bioinformatic and statistical analyses were performed using MACH v1.0, R, and PLINK. A total of 7953 variants were imputed in ABO, CCR2, CCR9, CXCR6, DPP9, FYCO1, IL10RB/IFNAR2, LZTFL1, OAS1, OAS2, OAS3, SLC6A20, TYK2, and XCR1. Statistically significant allele differences were reported for 10 and 7 previously identified and confirmed variants: ABO rs657152, DPP9 rs2109069, LZTFL1 rs11385942, OAS1 rs10774671, OAS1 rs2660, OAS2 rs1293767, and OAS3 rs1859330 (p < 0.03). In addition, we identified 842 variants in these COVID-related genes with significant allele frequency differences between COVID patients and the general population (p-values ranging from < 1E-2 to 1E-179). Our observations confirm the presence of genetic differences in COVID-19 patients in an admixed population and prompt investigation of the statistical relevance of additional variants in these and other genes that could identify local and geographical patterns of COVID-19.
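
As a rough illustration of what comparing allele frequencies between two cohorts involves, the sketch below runs a 1-degree-of-freedom allelic chi-square test on a 2x2 table of allele counts. The abstract names the tools used (MACH, R, PLINK) but not the exact test, so the choice of test, the function name `allelic_chi_square`, and the allele counts in the usage example are assumptions; only the cohort sizes (106 patients, 3079 general-population individuals) come from the abstract.

```python
from math import erfc, sqrt


def allelic_chi_square(case_alt, case_chroms, ctrl_alt, ctrl_chroms):
    """Chi-square test (1 df) on alternate vs. reference allele counts.

    case_chroms / ctrl_chroms are chromosome counts (2 x individuals for
    autosomal variants). Returns (chi2, p_value).
    """
    table = [
        [case_alt, case_chroms - case_alt],
        [ctrl_alt, ctrl_chroms - ctrl_alt],
    ]
    rows = [case_chroms, ctrl_chroms]
    cols = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    n = case_chroms + ctrl_chroms
    if min(rows) == 0 or min(cols) == 0:
        # Monomorphic site or empty cohort: no frequency difference to test.
        return 0.0, 1.0
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


if __name__ == "__main__":
    # Cohort sizes from the abstract; the allele counts are hypothetical.
    patients, general = 106, 3079
    chi2, p = allelic_chi_square(
        case_alt=60, case_chroms=2 * patients,
        ctrl_alt=1200, ctrl_chroms=2 * general,
    )
    print(f"chi2={chi2:.3f}, p={p:.3g}")
```

With a case group this small, expected counts for rare alleles can fall below the usual threshold for the chi-square approximation, in which case an exact test would be the safer choice.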